-
Notifications
You must be signed in to change notification settings - Fork 11.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[move-compiler] Added dot chain parsing resilience #17106
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎
3 Ignored Deployments
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems rather subtle and I don't quite follow all of it, so just some initial questions :)
@@ -117,12 +117,6 @@ fn unexpected_token_error_( | |||
)) | |||
} | |||
|
|||
fn add_type_args_ambiguity_label(loc: Loc, mut diag: Box<Diagnostic>) -> Box<Diagnostic> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did this go away?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was no longer used after the "main" parser resilience PR (#16673) landed. This was due to the fact that generating this additional label was predicated on parse_comma_list
returning an error which it no longer does. It just got more explicit in the current PR as parse_optional_type_args
(wrapping parse_comma_list
) now also no longer returns an error.
context.add_diag(*diag); | ||
let end_loc = context.tokens.previous_end_loc(); | ||
let loc = make_loc(context.tokens.file_hash(), start_loc, end_loc); | ||
sp(loc, Symbol::from("")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's this symbol here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What we are trying to do here is to parse a piece of code like this:
fun foo(s: SomeStruct) {
s.
}
This is happening in parse_dot_or_index_chain
which contains a loop parsing a potentially multi-segment dot chain (e.g., s.foo().f
).
Considering the example above, we would first (before the loop even starts) parse s
, then encounter a dot and try to parse whatever comes after the dot. In our example, this would fail as due to }
being encountered. Still, we want the partial access chain to be available for auto-completion so that we can determine the type of s
, so we "complete" the chain with a fake (empty) identifier, and keep going with the parsing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But we will end up with a really strange error message Unbound field '' in 'a::m::AnotherStruct'
Which feels like something you maybe want to suppress? Could we add a new node or some sort of error node that the IDE can latch onto?
let n = match parse_identifier(context) { | ||
Ok(id) => id, | ||
Err(diag) => { | ||
done_parsing = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just break?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you consider an example I tried to put together in the other response, if we break here (instead of completing construction of Exp_::Dot
) we will not accomplish our goal as the AST node for the s.
statement will not be created.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we just return then?
let n = match parse_identifier(context) { | ||
Ok(id) => id, | ||
Err(diag) => { | ||
done_parsing = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we just return then?
loop { | ||
if done_parsing { | ||
break; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a while loop :)
context.add_diag(*diag); | ||
let end_loc = context.tokens.previous_end_loc(); | ||
let loc = make_loc(context.tokens.file_hash(), start_loc, end_loc); | ||
sp(loc, Symbol::from("")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But we will end up with a really strange error message Unbound field '' in 'a::m::AnotherStruct'
Which feels like something you maybe want to suppress? Could we add a new node or some sort of error node that the IDE can latch onto?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one comment, and I think we would benefit on getting some thoughts from @cgswords.
Otherwise looks good!
ExpDottedAccess::UnresolvedError => { | ||
assert!(idx == num_accessors - 1); | ||
break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels like it will not be enough for the IDE. I would imagine you will need something like UnannotatedExp_::InvalidAccess(Box<Exp>, Option<Name>)
which will let you float access to invalid fields and invalid methods
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure I understand the limitation here and was hoping you could you give an example when this is not going to be sufficient.
I was going for the IDE to be able to understand what all valid prefixes of an otherwise only partially parsed chain are. As per added test it should understand what s
(in s.;
) and s.another_field
(in s.another_field.;
) are, even if parsing fails after the final dot (as for auto-completion we don't really care what "wrong thing" comes after the final dot)
fun foo(s: AnotherStruct) {
let _tmp1 = s.; // incomplete with `;` (next line should parse)
let _tmp2 = s.another_field.; // incomplete with `;` (next line should parse)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you have s.foo_
in the IDE, you want all fields and functions that are prefixed by foo_
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For me personally, I never do s.
and wait, I always think I know the name of the function or field, and then am sadly mistaken. But autocorrect picks up about halfway through me typing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. From what I understand, though, integration with the IDE will work in a slightly different way. There is only one trigger event for auto-completion, and it's the appearance of the .
character. Whatever you type afterwards does not trigger further auto-completion events. This means, that the LSP implementation needs to provide all possible completions upon just seeing .
(as otherwise it will not have another chance). It may appear as if it's the LSP that does the filtering (in your example, show only functions prefixed by foo_
) but it's the "client" side that actually does it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can include IDE integration in this PR and make sure that everything works as expected but per @cgswords's previous request I am trying to keep the compiler PRs and IDE PRs separate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I take it back :-) I thought it over and now I am understanding much better what you had in mind! My argument still partially stands, but there are cases when we'd need the "invalid" names. More changes (likely) coming!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reworked things a bit. We indeed need new UnannotatedExp_::InvalidAccess
to support a dot-chain with no proper identifier after the dot (e.g., tmp.;
), even if the dot is on a separate line (that's why we need to preserve dot's location in conjunction with the type of expression that precedes it, so that the symbolicator can get the preceding expression type having only location of the dot at its disposal).
I don't think we need anything new to represent a dot chain with an "invalid" field or method name name (e.g., tmp.invalid
where invalid
does not represent a field or a method name). This will still parse as a Exp_::Dot
(rather than Exp_::DotUnresolved
) but with UnresolvedType
instead of some real one, and we just have to make sure that we process these correctly in the symbolicator.
_ => match parse_identifier(context) { | ||
Err(diag) => { | ||
context.add_diag(*diag); | ||
Exp_::DotUnresolved(Box::new(lhs)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this turned out relatively cleanly!
c0fab10
to
992569e
Compare
@@ -1734,6 +1735,10 @@ impl AstDebug for ExpDotted_ { | |||
w.comma(&rhs.value, |w, e| e.ast_debug(w)); | |||
w.write("]"); | |||
} | |||
D::DotUnresolved(_, e) => { | |||
e.ast_debug(w); | |||
w.write("") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: have this print something more clear
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was supposed to be (fixed now):
e.ast_debug(w);
w.write("")
Exp_::DotUnresolved(loc, Box::new(lhs)) | ||
} | ||
Ok(n) => { | ||
if is_start_of_call_after_function_name(context, &n) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to check this either way so that we support s.foo.(x, y)
? I can imagine trying to get IDE help by hvaing s.foo.bar(x, y)
and removing bar
to try to get autocomplete.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this will work without any changes as s.foo.(x, y)
will parse to the DotUnresolved
case and auto-completion will work correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added this to symbolicator tests, but I am also planning to have a more comprehensive auto-completion tests once auto-completion features start landing (i.e., have unit tests that actually check what would auto-completion actually insert in these partially-parsed cases).
let is_macro = if let Tok::Exclaim = context.tokens.peek() { | ||
let loc = current_token_loc(context.tokens); | ||
context.advance(); | ||
Some(loc) | ||
} else { | ||
None | ||
}; | ||
let mut tys = None; | ||
if context.tokens.peek() == Tok::Less | ||
&& n.loc.end() as usize == call_start | ||
{ | ||
tys = parse_optional_type_args(context); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason not to just call parse_macro_opt_and_tyargs_opt
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not that I can see. Previously, I just used what was already there, but now I rewrote it to use parse_macro_opt_and_tyargs_opt
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few small follow-ups.
One broader question is: why do we maintaing DotResolved
so deep into the compiler? We don't appear to be using it anywhere yet. Can you say what the plan is?
For an incomplete dot chain that does not have a parsable identifier after dot (e.g., |
let _tmp2 = s.another_field.; // incomplete with `;` (next line should parse) | ||
let _tmp3 = s.another_field. // incomplete without `;` (unexpected `let`) | ||
let _tmp4 = s; | ||
let _tmp = s. // incomplete without `;` (unexpected `}`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This error is a bit weird because it just says it's expected an identifier, when a number is also allowed. Maybe a last small thing to fix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. This was the original behavior, though, and I simply kept it. That being said, it definitely makes sense to be more precise here so I changed it to generate a diagnostic according to your suggestion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. One last little request.
7ee7c75
to
4d29768
Compare
4d29768
to
61466e4
Compare
Description
This PR adds parsing resilience to dot chains (e.g.,
some_struct.some_field
). It's a pre-requisite to adding auto-completion for struct fields and struct functions (aka methods) to the IDE.Implementation-wise, I was also trying making
parse_identifier
parser-resilient (have it always return a value instead ofResult
). This does not quite work, though, since inparse_dot_or_index_chain
we need to know that if an identifier fails to parse, we should stop parsing the chain itself. If theparse_identifier
never returns an error, we would simply continue parsing the chain and generate weird errors.Test plan
An additional compiler test has been added. Also a symbolicator test has been added to verify that prefixes of partially parsed dot chains are parsed correctly.
Release notes
Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.
For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.
some_struct.some_field
) as parsing errors no longer prevent compilation. Consequently, the compiler can reach later compilation stages where it might generate additional diagnostics.